Large indexes of the web are usually maintained by automated programs commonly called web crawlers or robots. Lycos and WebCrawler, for example, use automated software to discover, download, and explore web pages. Generally this is desirable: it means your site is more likely to be visited, because it can be found using a web search. But occasionally it is undesirable, especially if:

- You are on a slow link and you don't want your whole site downloaded by a robot
- Your site contains information which you would rather was not indexed
- Your site contains large 'virtual' hierarchies which would be time-consuming or impossible for a robot to explore.

The solution has been agreed upon by the people who write robots: a robot checks for the existence of a file called robots.txt in the root of your web site, and that file contains a list of the areas of your site that robots may or may not visit. For a full explanation of the format of the robots.txt file see: http://info.webcrawler.com/mak/projects/robots/norobots.html
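To illustrate the robot side of this convention, here is a minimal sketch using Python's standard urllib.robotparser module. The site URL and robot name are hypothetical examples, and a real robot would fetch the live /robots.txt rather than parse hard-coded rules:

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    # A real robot would fetch the live file:
    #   rp.set_url("http://www.example.com/robots.txt")
    #   rp.read()
    # Here we parse the sample rules from the file shown below.
    rp.parse([
        "User-agent: *",
        "Disallow: /local/",
    ])

    # A well-behaved robot checks every URL against the rules before fetching it.
    print(rp.can_fetch("MyRobot", "http://www.example.com/index.html"))         # True
    print(rp.can_fetch("MyRobot", "http://www.example.com/local/private.html")) # False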
There is a sample robots.txt file in this directory which denies robots access to the /local/ part of your web site. The contents look like:

    # robots.txt file
    #
    # purpose: to stop World Wide Web Robots trawling your web site
    # (or parts thereof)
    #
    # For more info consult:
    # http://info.webcrawler.com/mak/projects/robots/norobots.html

    User-agent: *
    Disallow: /local/

To put these rules into effect, you just need to put the file in the root directory of your web site, so that robots can request it as /robots.txt.
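The file above excludes robots only from the /local/ hierarchy. If you wish to prohibit all robot access, the standard format uses a single slash in the Disallow field, while an empty Disallow field permits everything. For illustration (these variants are not the file shipped here):

    # Exclude all robots from the entire site:
    User-agent: *
    Disallow: /

    # Allow all robots complete access:
    User-agent: *
    Disallow: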